This analysis first explores candidate non-linear transformations, then details how the four-parameter transform was applied.
We began by evaluating cubic splines with a penalty encouraging
monotonicity, fit as a generalized additive model with the
mgcv package. We increased the number of
degrees from 1 to 5, used AIC to select the
best-supported model for each landscape, and recorded the number of degrees
to use for each transform.
We then visually evaluated the cubic spline transformations (in blue) for each landscape, with the x-axis showing additive effects as predicted by the first-order background-averaged model and the y-axis showing observed effects.
The fits show a degree of overfitting. One way to alleviate this would be to arbitrarily lower the number of degrees; instead, we opted for a different transformation method that is less prone to overfitting.
The drawback of using monotonic splines (or even power transforms), as can be seen from the plots above, is the lack of bounding. Bounding is crucial to avoid greatly transforming phenotype values that lie outside the bounds of the transform, which is itself constrained by the range of the predicted first-order effects.
In other words, phenotypes whose magnitudes lie outside the range of values predicted by the first-order model are at risk of being incorrectly transformed. Though this remains somewhat true for a four-parameter model, its bounded upper and lower thresholds ensure that phenotypes lying outside the predicted range are restricted to values at or near the upper and lower bounds.
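The bounding argument can be illustrated with a small sketch. It assumes the four-parameter transform is a logistic curve with a lower bound, an upper bound, a midpoint, and a slope; that parameterization, the function names, and every numeric value below are illustrative assumptions, not the form or the fits used in this analysis. An unbounded power transform is included only for contrast.

```python
import math

def four_parameter_logistic(x, lower, upper, midpoint, slope):
    """Assumed form: a sigmoid that saturates at `lower` and `upper`."""
    z = slope * (x - midpoint)
    # Numerically stable evaluation of 1 / (1 + exp(-z)) for large |z|.
    if z >= 0:
        s = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        s = e / (1.0 + e)
    return lower + (upper - lower) * s

def power_transform(x, exponent):
    """Unbounded, sign-preserving power transform (contrast only)."""
    return math.copysign(abs(x) ** exponent, x)

# Hypothetical parameters; inputs far outside [0, 1] mimic phenotypes
# that fall outside the range predicted by the first-order model.
params = dict(lower=0.0, upper=1.0, midpoint=0.5, slope=10.0)
for x in (-5.0, 0.0, 0.5, 1.0, 5.0):
    bounded = four_parameter_logistic(x, **params)
    unbounded = power_transform(x, 3)
    print(f"x={x:5.1f}  four-parameter={bounded:.4f}  power={unbounded:.1f}")
```

Under these assumptions, the extreme inputs stay within the lower and upper bounds of the sigmoid, whereas the power transform grows without limit.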
We attempted to transform all datasets using the four-parameter
function, with fitting performed by non-linear least squares regression
using nlsLM. Each fit is compared to a simple linear model
fit using AIC to determine whether the added complexity
of the four-parameter transform is justified; otherwise,
no transform is applied.
## Error in nlsModel(formula, mf, start, wts) :
## singular gradient matrix at initial parameter estimates
## Error in nlsModel(formula, mf, start, wts) :
## singular gradient matrix at initial parameter estimates
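The AIC comparison between the linear and four-parameter fits can be sketched as follows. This is a minimal illustration assuming Gaussian errors, with the residual variance counted as one extra estimated parameter; the function names and the residual sums of squares in the usage lines are hypothetical, and the sketch does not cover how landscapes whose fit failed were handled.

```python
import math

def gaussian_aic(rss, n_obs, n_coef):
    """AIC of a least-squares fit under Gaussian errors.

    The residual variance counts as one extra estimated parameter.
    """
    log_lik = -0.5 * n_obs * (math.log(2 * math.pi * rss / n_obs) + 1)
    return -2 * log_lik + 2 * (n_coef + 1)

def choose_transform(rss_linear, rss_four_param, n_obs):
    """Return 'four_parameter' only if it has the lower AIC, else 'none'."""
    aic_linear = gaussian_aic(rss_linear, n_obs, n_coef=2)    # intercept, slope
    aic_four = gaussian_aic(rss_four_param, n_obs, n_coef=4)  # four curve parameters
    return "four_parameter" if aic_four < aic_linear else "none"

# Hypothetical residual sums of squares for a landscape with 100 genotypes.
print(choose_transform(rss_linear=10.0, rss_four_param=9.9, n_obs=100))
print(choose_transform(rss_linear=10.0, rss_four_param=5.0, n_obs=100))
```

A marginal reduction in residual error does not offset the penalty for the two additional parameters, so the transform is kept only when the improvement in fit is substantial.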
Indeed, these fits appear much better visually than the spline fits, as they avoid overfitting to the datasets. We used these transforms in the subsequent analyses. Note: we removed landscapes TEM_growth_AMP, TEM_growth_AMC, TEM_growth_CAZ, and TEM_growth_TZP. Although the four-parameter model was more parsimonious than the linear model for these landscapes, the fits reduced all values in the landscape to binary values representing the upper or lower bound of the four-parameter model.
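A check for this kind of collapse could look like the sketch below. The helper name, the tolerance, and the example values are hypothetical; the criterion actually used to identify the removed landscapes is not stated here, so this is only one plausible way to flag fits whose transformed values all sit at a bound.

```python
def collapses_to_bounds(transformed, lower, upper, tolerance=0.01):
    """True if every value lies within `tolerance` of a bound.

    `tolerance` is a fraction of the distance between the bounds.
    """
    span = upper - lower
    if span <= 0:
        raise ValueError("upper must exceed lower")
    margin = tolerance * span
    return all(
        (v - lower) <= margin or (upper - v) <= margin
        for v in transformed
    )

# Hypothetical transformed phenotypes for two landscapes.
print(collapses_to_bounds([0.0, 1.0, 0.999, 0.001], lower=0.0, upper=1.0))
print(collapses_to_bounds([0.0, 0.5, 1.0], lower=0.0, upper=1.0))
```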